Disputation
16 April 2024
University of Mannheim
Which methods can we use to classify data from open-ended survey questions?
Can we leverage these methods to make empirical contributions to substantive questions?
1️⃣ The growing number of methods for collecting natural language (e.g., smartphone surveys with voice technologies) requires an evaluation of the available classification methods.
2️⃣ The special structure of open-ended survey answers (e.g., brevity, lack of context) requires testing machine learning methods in the survey context.
3️⃣ Open answers have the potential to equip researchers with rich data useful for various subjects of research.
| Study 1 | Study 2 | Study 3 |
|---|---|---|
| How valid are trust survey measures? New insights from open-ended probing data and supervised machine learning | Open-ended survey questions: A comparison of information content in text and audio response formats | Asking Why: Is there an Affective Component of Political Trust Ratings in Surveys? |
| Research Fields | | |
|---|---|---|
| Measurement equivalence | Questionnaire Design | Emotion Analysis |
Background: ongoing debates about which type of trust survey researchers are measuring with traditional survey items (i.e., the equivalence debate; cf. Bauer & Freitag 2018)
Research Question: How valid are traditional trust survey measures?
Questionnaire Design: 5 open-ended questions per respondent, block-randomized order
Data: U.S. non-probability sample; \(n\)=1,500 with 7,497 open answers
Supervised classification approach:
| ID | Measure | Trust | Probing Answer | Associations (known others) | Associations (sentiment) |
|---|---|---|---|---|---|
| 123 | Most people | 0.33 | I was thinking of people I don’t know personally. | 0 (No) | 0 (neutral/positive) |
| 3139 | Most people | 0.17 | Tourists that come to our little village. I tend to be very wary of them. | 0 (No) | 1 (negative) |
| 2980 | Stranger | 0 | No one in particular, but I don’t think I could trust anyone ever again. | 0 (No) | 1 (negative) |
| 4286 | Watching a loved one | 0 | A former neighbor of mine who was a single father with a son close to my son’s age. | 1 (Yes) | 0 (neutral/positive) |
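The idea of supervised classification can be illustrated with a minimal sketch. It assumes a bag-of-words multinomial Naive Bayes classifier with add-one smoothing, which is a stand-in chosen for brevity and not necessarily the model used in Study 1. The training data are the four example answers from the table with their sentiment codes; the function names `train` and `predict` and the new answer at the end are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(texts, labels):
    """Count tokens per class (assumed model: multinomial Naive Bayes)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for text, label in zip(texts, labels):
        word_counts[label].update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return {"class_counts": class_counts, "word_counts": word_counts, "vocab": vocab}


def predict(model, text):
    """Return the class with the highest smoothed log-probability."""
    vocab = model["vocab"]
    tokens = [t for t in tokenize(text) if t in vocab]  # ignore unseen words
    n_docs = sum(model["class_counts"].values())
    scores = {}
    for label, n_label in model["class_counts"].items():
        counts = model["word_counts"][label]
        total = sum(counts.values())
        score = math.log(n_label / n_docs)  # class prior
        for token in tokens:
            # add-one (Laplace) smoothing avoids zero probabilities
            score += math.log((counts[token] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Example answers and sentiment codes taken from the table above
# (1 = negative, 0 = neutral/positive).
answers = [
    "I was thinking of people I don’t know personally.",
    "Tourists that come to our little village. I tend to be very wary of them.",
    "No one in particular, but I don’t think I could trust anyone ever again.",
    "A former neighbor of mine who was a single father with a son close to my son’s age.",
]
sentiment = [0, 1, 1, 0]

model = train(answers, sentiment)

# Hypothetical new answer, not part of the study data.
print(predict(model, "I am wary of tourists"))
```

With only four training answers this is purely illustrative; in practice a hand-coded subset of the open answers would serve as training data, and the classifier would then label the remaining answers.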
Background: requests for spoken answers are assumed to trigger an open narration with more intuitive and spontaneous answers (e.g., Gavras et al. 2022)
Research Question: Are there differences in information content between responses given in voice and text formats?
Experimental Design: random assignment into either the text or voice condition
Operationalization of information content in open answers via application of measures from information theory and machine learning
Questionnaire Design: 9 open-ended questions per respondent, block-randomized order
Data: U.S. non-probability sample; \(n\)=1,461 with \(n_{text}\)=800 and \(n_{audio}\)=661
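One way to make "information content" concrete is the Shannon entropy of the word distribution in an answer. The sketch below assumes this particular measure and a simple word-level tokenization; the slides do not state which information-theoretic measures were applied, so the function name `shannon_entropy` and the tokenizer are illustrative choices only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def shannon_entropy(text):
    """Entropy (in bits) of the token distribution within one answer.

    H = -sum_w p(w) * log2 p(w), where p(w) is the relative frequency
    of token w in the answer. Empty answers are assigned 0 bits.
    """
    counts = Counter(tokenize(text))
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical toy inputs: four distinct words carry 2 bits,
# a single repeated word carries none.
print(shannon_entropy("one two three four"))
print(shannon_entropy("yes yes yes"))
```

Applied to typed answers and to transcripts of spoken answers, such a score could be compared across the text and voice conditions. Longer answers tend to have higher entropy, so answer length would need to be taken into account in any comparison.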
Background: conventional notion stating that trust originates from informed, rational, and consequential judgments is challenged by the idea of an “affect-based” form of (political) trust (e.g., Theiss-Morse and Barton 2017)
Research Question: Are individual trust judgments in surveys driven by affective rationales?
Questionnaire Design: voice condition only
LLMs allow modeling with less training data and domain-specific knowledge (via fine-tuning and prompting techniques)
Bauer, P. C., and M. Freitag. 2018. “Measuring Trust.” Pp. 1–27 in The Oxford Handbook of Social and Political Trust, edited by E. M. Uslaner. Oxford University Press.
Gavras, K. et al. 2022. “Innovating the collection of open-ended answers: The linguistic and content characteristics of written and oral answers to political attitude questions.” Journal of the Royal Statistical Society. Series A, 185(3):872–890.
Pérez, J. et al. 2023. “Pysentimiento: A Python Toolkit for Opinion Mining and Social NLP Tasks.” arXiv.
Ravanelli, M. et al. 2021. “SpeechBrain: A General-Purpose Speech Toolkit.” arXiv.
Theiss-Morse, E., and D. Barton. 2017. “Emotion, Cognition, and Political Trust.” Pp. 160–75 in Handbook on Political Trust. Edward Elgar Publishing.